Code Structure–Guided Transformer for Source Code Summarization

Abstract

Code summaries help developers comprehend programs and reduce the time needed to infer program functionality during software maintenance. Recent efforts resort to deep learning techniques such as sequence-to-sequence models for generating accurate code summaries, among which Transformer-based approaches have achieved promising performance. However, effectively integrating code structure information into the Transformer is under-explored in this task domain. In this article, we propose a novel approach named SG-Trans to incorporate code structural properties into the Transformer. Specifically, we inject the local symbolic information (e.g., code tokens and statements) and the global syntactic structure (e.g., dataflow graph) into the self-attention module of the Transformer as inductive bias. To further capture the hierarchical characteristics of code, the local information and global structure are designed to distribute in the attention heads of the lower layers and high layers of the Transformer. Extensive evaluation shows the superior performance of SG-Trans over the state-of-the-art approaches. Compared with the best-performing baseline, SG-Trans still improves 1.4% and 2.0% on two benchmark datasets, respectively, in terms of METEOR score, a metric widely used for measuring generation quality.
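As a rough illustration of what injecting structure into self-attention as an inductive bias can look like, the sketch below restricts attention to token pairs allowed by a boolean mask. It is a minimal example under stated assumptions, not the SG-Trans implementation: the mask-based formulation, the function names (masked_attention_weights, same_statement_mask, dataflow_mask), and the toy inputs are all hypothetical. A "local" mask links tokens in the same statement, and a "global" mask links tokens joined by a dataflow edge; following the abstract, one might apply the former in lower-layer heads and the latter in higher-layer heads.

```python
import math


def masked_attention_weights(scores, mask):
    """Row-wise softmax over raw scores, restricted to positions where mask is True."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        if not allowed:
            # No permitted position: leave this row with zero attention.
            weights.append([0.0] * len(row_scores))
            continue
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def same_statement_mask(statement_ids):
    """Hypothetical local mask: tokens attend only within their own statement."""
    return [[a == b for b in statement_ids] for a in statement_ids]


def dataflow_mask(n, edges):
    """Hypothetical global mask: self-links plus symmetric dataflow edges."""
    mask = [[i == j for j in range(n)] for i in range(n)]
    for src, dst in edges:
        mask[src][dst] = True
        mask[dst][src] = True
    return mask


# Toy input (assumed): 4 tokens, tokens 0-1 in statement 0, tokens 2-3 in statement 1.
statement_ids = [0, 0, 1, 1]
scores = [[0.0] * 4 for _ in range(4)]  # uniform raw attention scores

local_weights = masked_attention_weights(scores, same_statement_mask(statement_ids))
global_weights = masked_attention_weights(scores, dataflow_mask(4, [(0, 3)]))

print(local_weights[0])   # attention of token 0 under the local mask
print(global_weights[0])  # attention of token 0 under the global mask
```

With uniform scores, token 0 splits its attention evenly between tokens 0 and 1 under the local mask, and between tokens 0 and 3 under the global mask, which shows how the mask alone steers where a head can look.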

Similar resources

A Convolutional Attention Network for Extreme Summarization of Source Code

Attention mechanisms in neural networks have proved useful for problems in which the input and output do not have fixed dimension. Often there exist features that are locally translation invariant and would be valuable for directing the model’s attention, but previous attentional architectures are not constructed to learn such features specifically. We introduce an attentional neural network th...

Automated feature discovery via sentence selection and source code summarization

Programs are, in essence, a collection of implemented features. Feature Discovery in software engineering is the task of identifying key functionalities that a program implements. Manual feature discovery can be time-consuming and expensive, leading to automatic feature discovery tools being developed. However, these approaches typically only describe features using lists of keywords, which can...

Source code transformations for efficient SIMD code generation

Despite the effort invested in the last years in commercial compilers to generate efficient SIMD instructions-based code sequences from conventional sequential programs, the small number of compilers that can automatically use these instructions achieve in most cases unsatisfactory results. This work shows how exposing register level reuse in source codes helps vectorizing compilers such as ICC to gene...

Source Code Implications for Malcode

The availability of source code for both exploits and malicious code is higher than it has ever been in the history of computing. More important, the source codes for highly successful, high-profile malicious tools are now available. Several years ago it was almost impossible to obtain the source for these worms and exploits, forcing actors to rely on minor changes to binaries to generate new var...

Preprocessing of Object-Oriented Source Code for Code Retrieval

Object-oriented source code occurs in diverse programming languages with documentation using miscellaneous standards, comments in individual styles, or associated test cases that are hard to exploit through information retrieval or knowledge discovery techniques. Typically, the information about object-oriented source code for a software system is distributed across several different sources, w...

Journal

Journal title: ACM Transactions on Software Engineering and Methodology

Year: 2023

ISSN: 1049-331X, 1557-7392

DOI: https://doi.org/10.1145/3522674